AITopics | self-supervised pre-training

self-supervised pre-training

CroCo: Self-Supervised Pre-training for 3D Vision Tasks by Cross-View Completion

Neural Information Processing Systems | Dec-23-2025, 20:02:33 GMT

name change, self-supervised pre-training, vision task, (10 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

CroCo: Self-Supervised Pre-training for 3D Vision Tasks by Cross-View Completion

Neural Information Processing Systems | Oct-9-2024, 23:08:33 GMT

Masked Image Modeling (MIM) has recently been established as a potent pre-training paradigm. A pretext task is constructed by masking patches in an input image, and this masked content is then predicted by a neural network using the visible patches as its sole input. This pre-training leads to state-of-the-art performance when finetuned for high-level semantic tasks. In this paper we instead seek to learn representations that transfer well to a wide variety of 3D vision and lower-level geometric downstream tasks, such as depth prediction or optical flow estimation. Inspired by MIM, we propose an unsupervised representation learning task trained from pairs of images showing the same scene from different viewpoints.
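
The masking step of the pretext task can be illustrated with a minimal sketch. It only shows how patch indices might be split into visible and masked sets. The function name `split_patches`, the 14x14 patch grid, and the 0.75 mask ratio are assumptions for illustration, not values taken from the paper.

```python
import random


def split_patches(num_patches, mask_ratio, seed=None):
    """Randomly split patch indices into (visible, masked) lists.

    Hypothetical helper: the network would receive only the visible
    patches and be trained to predict the content of the masked ones.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(round(num_patches * mask_ratio))
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


# Assumed example: a 14x14 grid of patches with 75% of them masked.
visible, masked = split_patches(14 * 14, 0.75, seed=0)
print(len(visible), len(masked))
```

In a cross-view setting, the second image of the same scene would be supplied unmasked alongside the visible patches; that part is not sketched here.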

cross-view completion, self-supervised pre-training, vision task, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Improving Self-supervised Pre-training using Accent-Specific Codebooks

Prabhu, Darshan, Gupta, Abhishek, Nitsure, Omkar, Jyothi, Preethi, Ganapathy, Sriram

arXiv.org Artificial Intelligence | Jul-4-2024

Speech accents present a serious challenge to the performance of state-of-the-art end-to-end Automatic Speech Recognition (ASR) systems. Even with self-supervised learning and pre-training of ASR models, accent invariance is seldom achieved. In this work, we propose an accent-aware adaptation technique for self-supervised learning that introduces a trainable set of accent-specific codebooks to the self-supervised architecture. These learnable codebooks enable the model to capture accent-specific information during pre-training, which is further refined during ASR finetuning. On the Mozilla Common Voice dataset, our proposed approach outperforms all other accent-adaptation approaches on both seen and unseen English accents, with up to 9% relative reduction in word error rate (WER).
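
To make the reported metric concrete, here is a minimal sketch of how WER and a relative WER reduction are commonly computed. The function names and the example WER values (0.20 and 0.182) are hypothetical and chosen only so the arithmetic yields a 9% relative reduction; they are not results from the paper.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Single-row dynamic programme for Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer, new_wer):
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers: a baseline WER of 20.0% lowered to 18.2%.
print(relative_reduction(0.20, 0.182))
```

Note that a relative reduction is measured against the baseline WER, so it is not the same as an absolute drop of 9 percentage points.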

codebook, proc, representation, (14 more...)

arXiv.org Artificial Intelligence

2407.03734

Country:

North America > Canada (0.05)
South America > Colombia > Meta Department > Villavicencio (0.04)
Oceania > New Zealand (0.04)
(6 more...)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Self-supervised Pre-training of Text Recognizers

Kišš, Martin, Hradiš, Michal

arXiv.org Artificial Intelligence | May-1-2024

In this paper, we investigate self-supervised pre-training methods for document text recognition. Nowadays, large unlabeled datasets can be collected for many research tasks, including text recognition, but it is costly to annotate them. Methods that exploit unlabeled data are therefore an active research topic. We study self-supervised pre-training methods based on masked label prediction using three different approaches -- Feature Quantization, VQ-VAE, and Post-Quantized AE. We also investigate joint-embedding approaches with VICReg and NT-Xent objectives, for which we propose an image shifting technique to prevent a form of model collapse in which the model relies solely on positional encoding and completely ignores the input image. We perform our experiments on historical handwritten (Bentham) and historical printed datasets, mainly to investigate the benefits of the self-supervised pre-training techniques with different amounts of annotated target domain data. We use transfer learning as a strong baseline. The evaluation shows that self-supervised pre-training on data from the target domain is very effective, but it struggles to outperform transfer learning from closely related domains. This paper is one of the first studies exploring self-supervised pre-training in document text recognition, and we believe that it will become a cornerstone for future research in this area. We made our implementation of the investigated methods publicly available at https://github.com/DCGM/pero-pretraining.
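
For readers unfamiliar with the NT-Xent objective named above, the following is a minimal pure-Python sketch of its standard form. It is not the authors' implementation, which is in the linked repository. The temperature of 0.5 and the toy 2-D embeddings are assumptions, and embeddings are assumed to be non-zero vectors.

```python
import math


def _cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss where view_a[i] and view_b[i] form a positive pair."""
    n = len(view_a)
    z = list(view_a) + list(view_b)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)
        # Similarities to every other embedding; self-similarity is excluded.
        logits = [_cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        positive_logit = _cosine(z[i], z[positive]) / temperature
        # Numerically stable log-sum-exp for the denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - positive_logit
    return total / (2 * n)


# Toy example: matching views give a lower loss than mismatched ones.
anchors = [[1.0, 0.0], [0.0, 1.0]]
print(nt_xent_loss(anchors, [[1.0, 0.0], [0.0, 1.0]]))
print(nt_xent_loss(anchors, [[0.0, 1.0], [1.0, 0.0]]))
```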

dataset, self-supervised pre-training, text recognition, (14 more...)

arXiv.org Artificial Intelligence

2405.0042

Country: Europe > Czechia > South Moravian Region > Brno (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Self-Supervised Pre-Training for Precipitation Post-Processor

An, Sojung, Lee, Junha, Jang, Jiyeon, Na, Inchae, Park, Wooyeon, You, Sujeong

arXiv.org Artificial Intelligence | Dec-10-2023

Obtaining a sufficient forecast lead time for local precipitation is essential in preventing hazardous weather events. Global warming-induced climate change increases the challenge of accurately predicting severe precipitation events, such as heavy rainfall. In this paper, we propose a deep learning-based precipitation post-processor for numerical weather prediction (NWP) models. The precipitation post-processor consists of (i) employing self-supervised pre-training, where the parameters of the encoder are pre-trained on the reconstruction of the masked variables of the atmospheric physics domain; and (ii) conducting transfer learning on precipitation segmentation tasks (the target domain) from the pre-trained encoder. In addition, we introduce a heuristic labeling approach to train effectively on class-imbalanced datasets. Our experiments on precipitation correction for regional NWP show that the proposed method outperforms other approaches.

arxiv preprint arxiv, dataset, lead time, (11 more...)

arXiv.org Artificial Intelligence

2310.20187

Country:

Asia > South Korea > Seoul > Seoul (0.05)
Asia > North Korea (0.05)
Europe (0.04)
Asia > East Asia (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

A Knowledge-based Learning Framework for Self-supervised Pre-training Towards Enhanced Recognition of Biomedical Microscopy Images

Chen, Wei, Li, Chen, Chen, Dan, Luo, Xin

arXiv.org Artificial Intelligence | Jan-12-2023

Self-supervised pre-training has become the preferred choice to establish reliable neural networks for automated recognition of massive biomedical microscopy images, which are routinely annotation-free, without semantics, and without guarantee of quality. Note that this paradigm is still in its infancy and limited by closely related open issues: 1) how to learn robust representations in an unsupervised manner from unlabelled biomedical microscopy images of low diversity in samples? and 2) how to obtain the most significant representations demanded by a high-quality segmentation? Aiming at these issues, this study proposes a knowledge-based learning framework (TOWER) towards enhanced recognition of biomedical microscopy images, which works in three phases by synergizing contrastive learning and generative learning methods: 1) Sample Space Diversification: Reconstructive proxy tasks have been enabled to embed a priori knowledge with context highlighted to diversify the expanded sample space; 2) Enhanced Representation Learning: Informative noise-contrastive estimation loss regularizes the encoder to enhance representation learning of annotation-free images; 3) Correlated Optimization: Optimization operations in pre-training the encoder and the decoder have been correlated via image restoration from proxy tasks, targeting the need for semantic segmentation. Experiments have been conducted on public datasets of biomedical microscopy images against the state-of-the-art counterparts (e.g., SimCLR and BYOL), and results demonstrate that TOWER statistically outperforms all compared self-supervised methods, achieving a Dice improvement of 1.38 percentage points over SimCLR. TOWER also has potential in multi-modality medical image analysis and enables label-efficient semi-supervised learning, e.g., reducing the annotation cost by up to 99% in pathological classification.
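
Since the headline result is reported as a Dice improvement, a minimal sketch of the Dice coefficient for binary segmentation masks may help. The flattened-list mask representation and the convention of returning 1.0 when both masks are empty are assumptions made here, not details from the paper.

```python
def dice_coefficient(prediction, target):
    """Dice = 2 * |P intersect T| / (|P| + |T|) for flattened binary masks."""
    if len(prediction) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(prediction, target) if p and t)
    total = sum(1 for p in prediction if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy masks: one overlapping foreground pixel out of two in each mask.
print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```

An improvement of 1.38 percentage points means the Dice score, expressed as a percentage, rises by 1.38 in absolute terms.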

artificial intelligence, machine learning, representation, (15 more...)

arXiv.org Artificial Intelligence

2211.14715

Country:

Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Europe > Netherlands (0.04)
Asia > China > Hunan Province (0.04)

Genre: Research Report > New Finding (0.49)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Self-Supervised Mental Disorder Classifiers via Time Reversal

Iqbal, Zafar, Mahmood, Usman, Fu, Zening, Plis, Sergey

arXiv.org Artificial Intelligence | Nov-30-2022

Data scarcity is a notable problem, especially in the medical domain, due to patient data laws. Therefore, efficient pre-training techniques could help in combating this problem. In this paper, we demonstrate that a model trained on the time direction of functional neuro-imaging data could help in any downstream task, for example, separating disease cases from healthy controls in fMRI data. We train a deep neural network on independent components derived from fMRI data using the independent component analysis (ICA) technique. It learns time direction in the ICA-based data. This pre-trained model is further trained to classify brain disorders in different datasets. Through various experiments, we have shown that learning time direction helps a model learn some causal relation in fMRI data that helps in faster convergence, and consequently, the model generalizes well in downstream classification tasks even with fewer data records.
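
One plausible way to build training data for a time-direction pretext task is sketched below. The abstract does not say how samples are constructed, so the labeling scheme (1 for original order, 0 for reversed) and the function name `make_time_direction_dataset` are assumptions for illustration only.

```python
import random


def make_time_direction_dataset(sequences, seed=None):
    """Return shuffled (sequence, label) pairs for a time-direction task.

    Assumed labeling: 1 = original temporal order, 0 = time-reversed copy.
    """
    rng = random.Random(seed)
    samples = []
    for sequence in sequences:
        samples.append((list(sequence), 1))
        samples.append((list(reversed(sequence)), 0))
    rng.shuffle(samples)
    return samples


# Toy stand-in for per-component time courses.
for sequence, label in make_time_direction_dataset([[1, 2, 3], [4, 5, 6]], seed=0):
    print(sequence, label)
```

A classifier trained on such pairs needs no manual annotation, because the labels come from the data itself.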

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2211.16398

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.96)
